486 research outputs found

    Weighting-Based Treatment Effect Estimation via Distribution Learning

    Existing weighting methods for treatment effect estimation are often built upon the idea of propensity scores or covariate balance. They usually impose strong assumptions on the treatment assignment or outcome model to obtain unbiased estimation, such as linearity or specific functional forms, which makes them prone to model mis-specification. In this paper, we aim to alleviate these issues by developing a distribution learning-based weighting method. We first learn the true underlying distribution of covariates conditioned on treatment assignment, then leverage the ratio of the covariate density in the treatment group to that in the control group as the weight for estimating treatment effects. Specifically, we propose to approximate the distribution of covariates in both the treatment and control groups through invertible transformations via change of variables. To demonstrate the superiority, robustness, and generalizability of our method, we conduct extensive experiments using synthetic and real data. From the experimental results, we find that our method for estimating the average treatment effect on the treated (ATT) with observational data outperforms several cutting-edge weighting-only benchmark methods, and it maintains its advantage under a doubly robust estimation framework that combines weighting with advanced outcome modeling methods.
    Comment: 33 pages, 16 tables, 7 figures, GitHub: https://github.com/DLweighting/Distribution-Learning-based-weightin
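    The density-ratio weighting step described above can be sketched in a few lines. The snippet below is a minimal illustration with a single Gaussian covariate, using parametric normal density estimates as a simple stand-in for the paper's invertible-transform (change-of-variables) density learning; all names and data are hypothetical toy values.

```python
import math
import random
import statistics

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def att_density_ratio(x_treat, y_treat, x_ctrl, y_ctrl):
    """ATT via density-ratio weights w(x) = p(x|T=1) / p(x|T=0) on controls."""
    mu_t, sd_t = statistics.fmean(x_treat), statistics.stdev(x_treat)
    mu_c, sd_c = statistics.fmean(x_ctrl), statistics.stdev(x_ctrl)
    # Reweight controls so their covariate distribution matches the treated group.
    w = [normal_pdf(x, mu_t, sd_t) / normal_pdf(x, mu_c, sd_c) for x in x_ctrl]
    weighted_ctrl_mean = sum(wi * yi for wi, yi in zip(w, y_ctrl)) / sum(w)
    return statistics.fmean(y_treat) - weighted_ctrl_mean

# Toy data: outcome depends on covariate x (a confounder); treatment adds +2.
random.seed(0)
x_treat = [random.gauss(1.0, 1.0) for _ in range(2000)]  # treated covariates
x_ctrl = [random.gauss(0.0, 1.0) for _ in range(2000)]   # control covariates
y_treat = [x + 2.0 + random.gauss(0, 0.1) for x in x_treat]
y_ctrl = [x + random.gauss(0, 0.1) for x in x_ctrl]

att = att_density_ratio(x_treat, y_treat, x_ctrl, y_ctrl)      # close to 2
naive = statistics.fmean(y_treat) - statistics.fmean(y_ctrl)   # confounded, near 3
```

    The naive difference in means is biased by the covariate shift between groups, while the reweighted estimate recovers the true effect; the paper's contribution is to estimate these densities flexibly rather than with a fixed parametric family as done here.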

    Your Preference or Mine? A Randomized Field Experiment on Recommender Systems in Two-sided Matching Markets

    The literature on recommender systems mainly focuses on product recommendation, where the buyer’s preferences are considered. However, for user recommendation in two-sided matching markets, potential matches’ preferences may also play a role in the focal user’s decision-making. Hence, we seek to understand the impact of providing potential candidates’ preferences in such settings. In collaboration with an online dating platform, we design and conduct a randomized field experiment and present users with recommendations based on i) their own preferences, ii) potential matches’ preferences, or iii) mutual preferences. Interestingly, we find that users are sensitive to the provision of potential candidates’ preferences, and they proactively reach out to those “who might prefer them” despite those candidates’ relatively lower desirability. This leads to a greater improvement in matching. The findings provide valuable insights into how to design user recommendation systems beyond the current practice of recommending based on the focal user’s preferences.
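    As a toy illustration of the three recommendation policies compared in the experiment, one simple way to score and rank candidates is sketched below. The product aggregation for mutual preferences is an assumption for illustration, not the platform's actual ranking rule, and all names and scores are hypothetical.

```python
def rank_candidates(focal_prefs, candidate_prefs, mode="mutual"):
    """Rank candidate ids by preference score under three recommendation policies.

    focal_prefs[c]     -- how much the focal user likes candidate c (0..1)
    candidate_prefs[c] -- how much candidate c likes the focal user (0..1)
    """
    if mode == "own":        # i) focal user's own preferences
        score = focal_prefs
    elif mode == "theirs":   # ii) potential matches' preferences
        score = candidate_prefs
    elif mode == "mutual":   # iii) mutual preferences (product of the two)
        score = {c: focal_prefs[c] * candidate_prefs[c] for c in focal_prefs}
    else:
        raise ValueError(mode)
    return sorted(score, key=score.get, reverse=True)

focal = {"a": 0.9, "b": 0.6, "c": 0.3}   # focal user's liking of each candidate
theirs = {"a": 0.1, "b": 0.8, "c": 0.9}  # each candidate's liking of the focal user
```

    Here `rank_candidates(focal, theirs, "own")` ranks candidate "a" first, while the mutual rule promotes "b", whom both sides like moderately, mirroring the trade-off between desirability and reciprocity the experiment studies.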

    Support Neighbor Loss for Person Re-Identification

    Person re-identification (re-ID) has recently been tremendously boosted by the advancement of deep convolutional neural networks (CNNs). The majority of deep re-ID methods focus on designing new CNN architectures, while less attention is paid to the loss functions. Verification loss and identification loss are two types of losses widely used to train deep re-ID models, both of which, however, have limitations. Verification loss guides the network to generate feature embeddings whose intra-class variance is decreased while inter-class variance is enlarged. However, training with verification loss tends to converge slowly and perform unstably when the number of training samples is large. On the other hand, identification loss has good separating and scaling properties, but because it does not explicitly reduce intra-class variance, its performance on re-ID is limited, since the same person may have significant appearance disparity across different camera views. To avoid the limitations of both types of losses, we propose a new loss, called support neighbor (SN) loss. Rather than being derived from data sample pairs or triplets, SN loss is calculated from the positive and negative support neighbor sets of each anchor sample, which contain more valuable contextual information and neighborhood structure, beneficial for more stable performance. To ensure scalability and separability, a softmax-like function is formulated to push apart the positive and negative support sets. To reduce intra-class variance, the distance between the anchor's nearest positive neighbor and furthest positive sample is penalized. Integrating SN loss on top of ResNet50 yields re-ID results superior to the state of the art on several widely used datasets.
    Comment: Accepted by ACM Multimedia (ACM MM) 201
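    The two ingredients of SN loss described above, a softmax-like separation term over the support sets and a penalty on the gap between the anchor's nearest and furthest positives, can be sketched as follows. This is a simplified stand-alone illustration, not the paper's exact formulation; the temperature and alpha parameters are assumptions.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def sn_loss(anchor, positives, negatives, temperature=1.0, alpha=1.0):
    """Support-neighbor-style loss for one anchor embedding.

    Separation term: a softmax-like score pushing the whole positive support
    set closer to the anchor than the negative support set.
    Compaction term: the gap between the anchor's nearest and furthest
    positive distances, penalized to shrink intra-class variance.
    """
    pos_d = [euclidean(anchor, p) for p in positives]
    neg_d = [euclidean(anchor, n) for n in negatives]
    pos_mass = sum(math.exp(-d / temperature) for d in pos_d)
    neg_mass = sum(math.exp(-d / temperature) for d in neg_d)
    separation = -math.log(pos_mass / (pos_mass + neg_mass))
    compaction = max(pos_d) - min(pos_d)
    return separation + alpha * compaction

# An anchor whose positives are near and negatives far gets a small loss.
loss_good = sn_loss([0, 0], [[0.1, 0], [0, 0.1]], [[5, 5], [6, 6]])
loss_bad = sn_loss([0, 0], [[5, 5], [6, 6]], [[0.1, 0], [0, 0.1]])
```

    Because the separation term aggregates over the whole support sets rather than a single pair or triplet, it uses the neighborhood structure the abstract argues is key to stable training.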

    Text Mining Patient-Doctor Online Forum Data from the Largest Online Health Community in China

    The present study uses data from the largest online health community in China, www.haodf.com, to examine the salient topics Chinese health consumers discuss with their doctors online. Preliminary research found 146,915 posts by patients and 123,059 posts by doctors from Aug. 2006 to Apr. 2014 on this open online forum. In total, 10,685 doctors participated in the forum discussions during this period. The text-mining results on topic modeling are still pending, but we have already found this data to be of promising and unique quality. We also look forward to more inspiring research questions to motivate this research.
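    Since the topic-modeling results are pending, a minimal stand-in for the salient-topic question, extracting the most frequent content words from a set of posts, might look like the sketch below. The posts and stopword list are hypothetical toy values, and real topic modeling (e.g. LDA) would go well beyond raw term counts.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"the", "a", "i", "my", "is", "to", "and", "of", "for", "what",
             "should", "this", "during", "take"}

def salient_terms(posts, k=3):
    """Toy 'topic' extraction: top-k content words across a set of posts."""
    counts = Counter(
        w for post in posts
        for w in re.findall(r"[a-z]+", post.lower())
        if w not in STOPWORDS
    )
    return [w for w, _ in counts.most_common(k)]

patient_posts = [
    "My blood pressure is high, what medication should I take?",
    "Is this blood pressure medication safe during pregnancy?",
]
top = salient_terms(patient_posts)
```

    Running the same counting separately over patient posts and doctor posts would already give a rough picture of how the two sides' vocabularies differ before any formal topic model is fit.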

    Leveraging Deep-learning and Field Experiment Response Heterogeneity to Enhance Customer Targeting Effectiveness

    Firms seek to better understand heterogeneity in customer response to marketing campaigns, which can boost customer targeting effectiveness. Motivated by the success of modern machine learning techniques, this paper presents a framework that leverages deep-learning algorithms and field experiment response heterogeneity to enhance customer targeting effectiveness. We recommend firms run a pilot randomized experiment and use the data to train various deep-learning models. By incorporating recurrent neural nets and deep perceptron nets, our optimal deep-learning model can capture both temporal and network effects in the purchase history, while addressing issues common to most predictive models, such as imbalanced training, data sparsity, temporality, and scalability. We then apply the learned optimal model to identify, from the large pool of remaining customers, the targets with the highest predicted purchase probabilities. Our application with a large department store, covering 2.8 million customers in total, shows that optimal deep-learning models can identify higher-value customer targets and lead to better sales performance of marketing campaigns compared with the industry's common practice of targeting by past purchase frequency or spending amount. We demonstrate that companies may achieve sub-optimal customer targeting not because they offer inferior campaign incentives, but because they use worse targeting rules and select low-value customer targets. The results inform managers that, beyond gauging the causal impact of marketing interventions, data from field experiments can also be leveraged to identify high-value customer targets. Overall, deep-learning algorithms can be integrated with field experiment response heterogeneity to improve the effectiveness of targeted campaigns.
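    The targeting step of the framework, scoring the remaining customers with the trained model and keeping the top of the ranking, can be sketched as below. The customer records and `p_buy` scores are random toy values standing in for the deep-learning model's predictions, contrasted with the frequency-based rule the paper benchmarks against.

```python
import random

def target_top_k(customers, score_fn, k):
    """Rank customers by a scoring rule and keep the top k as campaign targets."""
    return sorted(customers, key=score_fn, reverse=True)[:k]

# Toy customer base: past purchase frequency plus a model-predicted purchase
# probability (random here; in the paper this comes from the trained model).
random.seed(1)
customers = [
    {"id": i, "freq": random.randint(0, 20), "p_buy": random.random()}
    for i in range(1000)
]

# Model-based targeting vs. the common practice of targeting by frequency.
model_targets = target_top_k(customers, lambda c: c["p_buy"], k=100)
freq_targets = target_top_k(customers, lambda c: c["freq"], k=100)

# The two rules generally pick different customers; the paper's claim is that
# the model-based list yields higher campaign sales.
overlap = {c["id"] for c in model_targets} & {c["id"] for c in freq_targets}
```

    In a pilot-experiment setting, `p_buy` would be predicted per treatment arm, so the same ranking machinery can also target customers by estimated incremental response rather than raw purchase probability.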